tutorials-and-examples/how-tos/Provisioning DSVM.ipynb (230 lines of code) (raw):
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# How To: Provisioning Data Science Virtual Machine (DSVM)\n",
"\n",
"__Notebook Version:__ 1.0<br>\n",
"__Python Version:__ Python 3.6 (including Python 3.6 - AzureML)<br>\n",
"__Required Packages:__ azure 4.0.0, azure-cli-profile 2.1.4<br>\n",
"__Platforms Supported:__<br>\n",
" - Azure Notebooks Free Compute\n",
" - Azure Notebooks DSVM\n",
"__Data Source Required:__<br>\n",
" - no\n",
" \n",
"### Description\n",
"The sample notebook shows how to provision a Azure DSVM as an alternate computing resource for hosting Azure Notebooks.\n",
"\n",
"Azure Notebooks provides Free Compute as the default computing resource, which is free of charge. However, sometimes you do want to have a powerful computing environment, and you don't want to go through Direct Compute route which requires JupyterHub installation on Linux machines, then Data Science Virtual Machine (DSVM) becomes a vital choice.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Table of Contents\n",
"\n",
"1. How to create a new DSVM \n",
"2. How to use DSVM\n",
"3. Things to know about using DSVM"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. How to create a new DSVM"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Using the Azure Portal\n",
"\n",
"Follow <a href='https://docs.microsoft.com/azure/notebooks/use-data-science-virtual-machine' target='_blank'>this article</a> for details.\n",
"\n",
"You should select Linux Ubuntu DSVM. And keep in mind that on Azure DSVM, if you want to use Python 3.6 which is required by Microsoft Sentinel notebooks, you need to <font color=red> select Python 3.6 - AzureML.</font>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Using the Azure CLI"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# only run once\n",
"!pip install --upgrade Azure-Sentinel-Utilities"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# please enter your tenant domain below, for Microsoft, using: microsoft.onmicrosoft.com\n",
"!az login --tenant ''"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# User Input for creating a new DSVM\n",
"vm_size = 'Standard_DS3_v2'\n",
"\n",
"# replace [[your_subcription_id]] with 'real subscription id'\n",
"!az account set --subscription [[your_subcription_id]]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# replace all [[your_stuff]] with 'real values'\n",
"!az group deployment create \\\n",
" --resource-group [[your_subcription_id]] \\\n",
" --template-uri https://raw.githubusercontent.com/Azure/DataScienceVM/master/Scripts/CreateDSVM/Ubuntu/azuredeploy.json \\\n",
" --parameters \\\n",
" '{ \\\n",
" \"$schema\": \"https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#\",\\\n",
" \"contentVersion\": \"1.0.0.0\",\\\n",
" \"parameters\": {\\\n",
" \"adminUsername\": { \"value\" : \"[[your_admin_id]]\"},\\\n",
" \"adminPassword\": { \"value\" : \"[[your_admin_password]]\"},\\\n",
" \"vmName\": { \"value\" : \"[[vm_name]]\"},\\\n",
" \"vmSize\": { \"value\" : \"Standard_DS3_v2\"}\\\n",
" }\\\n",
" }'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"*** Please go to the project page to select the VM that you just created as your new computing platform (Run on ...) to continue ..."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. How to use DSVM\n",
"\n",
"1. Now that you have a DSVM, when you login to https://notebooks.azure.com, you can see you DSVM on the drop down list under Free Compute and Direct Compute.<br>\n",
"<br>\n",
"2. Of course you will select DSVM, it will ask you to validate your JIT credentials.<br>\n",
"<br>\n",
"3. Once you pick a notebook to run, you may encounter the following warning:<br>\n",
"<br>\n",
"As you may see, [Python 3.6 - AzureML] is the correct answer.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Things to know about using DSVM\n",
"\n",
"1. The most important thing to know about Azure Notebooks on DSVM is that: Azure Notebooks project home directory is not mounted on the DSVM. So any references to Azure Notebooks folders / files will incur File/folder not found exception. In other words, each ipynb notebook need to be independent of other files. \n",
"2. There are work-around solutions:<br>\n",
" a. Data files can be stored on Azure Blob storage and <a href='https://github.com/Azure/azure-storage-fuse' target='_blank'>blobfufe</a><br>\n",
" b. Python files can be added to the notebook by using the Jupyter magic, you can find an example here: <a href='https://github.com/Microsoft/connect-petdetector/blob/master/setup.ipynb' target='_blank'>%%writefile</a><br>\n",
" c. Configuration files are a bit more complicated. Using our Microsoft Sentinel config.json as an example, it is generated when you import Microsoft Sentinel Jupyter project from GitHub repo through Azure portal. The configuration JSON is Azure Log Analytics workspace specific file, so you clone one project for one Log Analytics workspace. You can find the config.json file at the root of the project home directory. <a href='https://orion-zhaozp.notebooks.azure.com/j/notebooks/Notebooks/Get%20Start.ipynb' target='_blank'>Get Start.jpynb</a> section 1 demonstrates how to set the configuration settings manually. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"hide_input": false,
"kernelspec": {
"name": "python36",
"display_name": "Python 3.6",
"language": "python"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": false,
"sideBar": true,
"skip_h1_title": false,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
},
"varInspector": {
"cols": {
"lenName": 16,
"lenType": 16,
"lenVar": 40
},
"kernels_config": {
"python": {
"delete_cmd_postfix": "",
"delete_cmd_prefix": "del ",
"library": "var_list.py",
"varRefreshCmd": "print(var_dic_list())"
},
"r": {
"delete_cmd_postfix": ") ",
"delete_cmd_prefix": "rm(",
"library": "var_list.r",
"varRefreshCmd": "cat(var_dic_list()) "
}
},
"types_to_exclude": [
"module",
"function",
"builtin_function_or_method",
"instance",
"_Feature"
],
"window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 1
}